ON THE TWO-PHASE FRAMEWORK FOR JOINT MODEL AND
DESIGN-BASED INFERENCE 

By Susana Rubin-Bleuer and Ioana Schiopu Kratina

Statistics Canada 

We establish a mathematical framework that formally validates 
the two-phase "super-population viewpoint" proposed by Hartley and 
Sielken [Biometrics 31 (1975) 411-422] by defining a product prob- 
ability space which includes both the design space and the model 
space. The methodology we develop combines finite population sam- 
pling theory and the classical theory of infinite population sampling 
to account for the underlying processes that produce the data un- 
der a unified approach. Our key results are the following: first, if the 
sample estimators converge in the design law and the model statis- 
tics converge in the model, then, under certain conditions, they are 
asymptotically independent, and they converge jointly in the product 
space; second, the sample estimating equation estimator is asymptot- 
ically normal around a super-population parameter. 

1. Introduction. Classical sampling theory concerns inference for finite
population parameters. For the finite population mean
$\bar{Y} = \sum_{i=1}^{N} y_i/N$, inference typically considers the interval

$$[\bar{y} \pm t_p\,\mathrm{se}(\bar{y})],$$

where $t_p$ is a constant chosen with a normal or Student distribution in
mind, and $\mathrm{se}(\bar{y})$ denotes the standard error of the sample mean
$\bar{y}$ (see [13]). The expression above means that $\bar{Y}$ is within the
interval $[\bar{y} - t_p\,\mathrm{se}(\bar{y}),\ \bar{y} + t_p\,\mathrm{se}(\bar{y})]$
with some degree of confidence. Here $N$ is the size of the finite
population, the $y_i$'s are considered nonstochastic but unknown numbers and
probability statements arise from the selection of units in the sample. No
distributional assumptions are made about the $y_i$'s. This nonparametric
approach to inference is often called design-based inference.

However, there are many situations when we have to resort to postulating
a model. For descriptive analysis in a finite population, we need a model
when we have to deal with nonresponse, small area estimation or measurement
errors. For studies involving scientific questions, the parameters of
stochastic models are sometimes of more interest than finite population
parameters. For example, in longitudinal surveys we are interested in modeling
the dependencies between health status and certain socio-economic covariates.
Staying within the finite population framework limits our inference to
the reference population only. To illustrate the issue, let us take the sample
mean $\bar{y}$ obtained from a sample of size $n$, and suppose that we wish to draw
conclusions on a more general population than the finite population from
which we obtained the sample: we view $\bar{y}$ as an estimator of the model mean
$\mu$. We have

(1.1)  $\sqrt{n}(\bar{y} - \mu) = \sqrt{n}(\bar{y} - \bar{Y}) + \sqrt{n/N}\,\sqrt{N}(\bar{Y} - \mu).$

The design-based large sample properties of the first term on the right- 
hand side of (1.1) have been studied for many sampling designs. Conditions 
were given for the asymptotic normality of the sample mean (design-based 
central limit theorem, or CLT): for simple random sampling without replacement
(SRSWOR) and rejective sampling with varying probabilities by Hájek
[8, 9], for probability proportional to size without replacement ($\pi$ps) designs
by Rosén [19, 20], and for stratified multistage probability proportional to
size with replacement (PPSWR) designs by Krewski and Rao [15]. For
descriptions of these and other sampling designs, see, for example, [28]. To
derive a design-based CLT for the left-hand side of (1.1), we would have to
assume not only that the sampling rate $n/N$ converges to zero, but also that
the sequence of numbers $\sqrt{N}(\bar{Y} - \mu)$ is bounded as $N \to \infty$. As a sequence
of numbers, this last condition is very restrictive. However, as a sequence of
sums of independent, identically distributed (i.i.d.) random variables (r.v.)
in the super-population, $\sqrt{N}(\bar{Y} - \mu)$ is bounded in probability and the
second term of the right-hand side of (1.1) converges to zero in the probability
of the model when n/N converges to zero. To study the asymptotic proper- 
ties of the survey sample means around the model mean, it is necessary to 
include the model and the design in the same probability space. 
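
As a numerical illustration of decomposition (1.1), the following minimal Python sketch generates a finite population from a hypothetical normal model, draws an SRSWOR sample, and evaluates both sides of (1.1). The function name `decompose`, the model mean `mu = 2.0` and the sizes `N = 1000`, `n = 50` are assumptions made only for this illustration; they do not come from the article.

```python
import math
import random

def decompose(sample, population, mu):
    """Return (lhs, design_term, model_term) of decomposition (1.1)."""
    n, N = len(sample), len(population)
    y_bar = sum(sample) / n        # sample mean
    Y_bar = sum(population) / N    # finite population mean
    lhs = math.sqrt(n) * (y_bar - mu)
    design_term = math.sqrt(n) * (y_bar - Y_bar)
    model_term = math.sqrt(n / N) * math.sqrt(N) * (Y_bar - mu)
    return lhs, design_term, model_term

rng = random.Random(0)
mu = 2.0                                                  # hypothetical model mean
population = [rng.gauss(mu, 1.0) for _ in range(1000)]    # finite population generated by the model
sample = rng.sample(population, 50)                       # SRSWOR sample of size n = 50
lhs, design_term, model_term = decompose(sample, population, mu)
print(lhs, design_term + model_term)                      # the two values agree up to rounding
```

The design term depends on which units were drawn, whereas the model term depends only on the generated population and is scaled by the square root of the sampling rate, which is why it becomes negligible when n/N tends to zero.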

In this article we first construct a product space, which is a mathematical 
framework for joint design-based and model-based inference. Our key results 
are quite general. First, we show that, under certain conditions, if the sur- 
vey sample estimators converge in the law of the sampling design and the 
associated model statistics converge in the law of the model (not necessarily 
to a Gaussian distribution), then they are asymptotically independent and 
they converge jointly in the product space. Second, we show that a survey 
sample estimator of a model parameter, which is derived from a very general
sampling estimating equation, exists, is consistent and is asymptotically normal.
Hartley and Sielken [11] introduced the "super-population" approach to
describe the relationship between the infinite population (also called super- 
population) and the finite population from which we select the sample. Many 
authors worked within the two-phase framework and accounted for the vari- 
ability due to the design and the model by means of the "anticipated vari- 
ance." The contributions of Fuller [6], Isaki and Fuller [12], Godambe and 
Thompson [7], Korn and Graubard [14], Pfeffermann and Sverchkov [17], 
Binder and Roberts [4], Rodriguez [18] and Molina, Smith and Sugden [16] 
are just a few among the many on the subject. 

Fuller [6] established large sample properties of the sample regression 
estimator around the model parameter with data obtained from stratified 
cluster samples. His approach could only be applied to stratified SRSWOR 
designs in the first stage. Our general approach to estimation of model pa- 
rameters extends Fuller [6] to more general designs and estimators, even if 
the sampling rate is nonnegligible. 

The formal expression of the product space, together with the key results
described above, establishes a general and unified methodology that
accommodates the diverse techniques of these authors, and has enabled us to extend
some of their results. Moreover, the more formal aspects of the methodology
(sub-$\sigma$-fields, filtrations in the product space) proved to be essential for
adapting counting process methodology to the analysis of survival survey 
data (see [23, 24]). In addition, the design-based distribution of a sample 
estimator is a "second phase" concept, that is, a conditional distribution 
given the minimal information in the model. In general, we could apply this 
methodology to most situations where we have a two phase randomization 
process. 

The joint design-model distribution of the sample data is also called the 
distribution of the sample variables (see [17]). We present other results 
that refer to the sample variables under the posterior distribution given 
the sample labels. For a sequence of random variables $Y = (Y_1, \ldots, Y_N)$, it
is well known that the posterior distribution of $Y$ given the sample outcome
$\{(i, Y_i = y_i),\ i \in s_0\}$ depends only on the sample $s_0$ actually drawn
and not on the sampling design used to draw it, provided that $s_0$ and $Y$
are stochastically independent given the design variables (see [29]). Here,
however, we look at conditioning just on the $s_0$ actually drawn. We also
show that the posterior distribution depends only on $s_0$. Note that whether
the labels are repeated or not is a consequence of the design. If the sample
$s_0$ from a with replacement (WR) sampling design has repeated labels, the
sample variables under the posterior distribution given $s_0$ are not
stochastically independent even if the original components of $Y$ were independent.
It is also well known that an SRSWOR from a finite population, which was
generated by a super-population, when viewed as a sample from the infinite
space inherits the same properties of the random variables which generated
the finite population. Fuller [6] applied the CLT to the array of variables 
from an SRSWOR design to obtain the asymptotic distribution of the sample 
regression estimator. It is not clear why the classical CLT could be applied 
in [6] without further assumptions. In this article, we show formally that his 
array of sample variables, not necessarily nested, consists of i.i.d. variables 
under the posterior distribution given the sample labels. For the CLT to hold 
for this array, we only require that the original super-population variables 
have a finite variance. 

In order to obtain the total (anticipated) variance in (1.1), we must im- 
pose (model-based) conditions on the super-population model, which survey 
statisticians would rather avoid. At the very least, some form of model- 
based independence is needed. Many authors assume that the sampling rate 
is small enough so they can ignore the variation due to the model compo- 
nent. However, Korn and Graubard [14] show that we should not dismiss 
the second term in the total variance without checking first that it is indeed 
sufficiently small relative to the first term. 

The article is organized as follows. In Sections 2-5 we develop the tools 
needed to incorporate the design and the model in the same space. Sec- 
tion 6 is an application of the product space methodology. In Section 2 we 
modify somewhat the usual definitions of sample design, finite population 
parameter and sample estimator to enable us to view them as random vari- 
ables in the super-population (Definition 4.2, Remark 4.2). In Section 3 we 
adopt the super-population definition in [28] to define what it means for 
a finite population to be generated by a super-population (Definition 3.1). 
Proposition 3.1 shows how conditions needed for the design-based CLT fol- 
low from simple conditions in the super-population. In Section 4 we define 
the general product space (Definitions 4.1, 4.3) and show how stochastic de- 
pendence is introduced in the product space (Example 4.1). We exploit the 
additional information on the design and the model by calculating posterior 
distributions and we study the interplay between dependence and indepen- 
dence of random variables viewed in the design space, the product space or 
the model space (Example 4.1 and Proposition 4.2). In Section 5 we show 
that, if the sample and super-population statistics converge in law in their 
respective spaces, they also converge in law in the product space. The two 
terms in the right-hand side of (1.1) are not, in general, stochastically in- 
dependent. We establish here their "asymptotic independence" under mild 
conditions in Theorem 5.1. Example 5.1 yields the asymptotic normality 
of the ratio estimator of the weighted average of the strata means under a 
stratified one-stage PPSWR design. In Section 6 we establish the existence 
and asymptotic normality of a sample estimator derived from a general es- 
timating equation, under general conditions. Example 6.1 is an application 
to a two-stage sampling design.

2. Finite populations and sampling designs.

Definition 2.1. A finite population $U = \{1, \ldots, N\}$ of size $N$ consists
of $N$ labels, with their associated data, that is, each unit $i$ is associated
to a unique vector $(y_i, x_i, z_i)$, $i = 1, \ldots, N$. Here $y_i \in \mathbb{R}^p$,
$x_i \in \mathbb{R}^k$ represent, respectively, the characteristics of interest and
the auxiliary information, and $z_i \in \mathbb{R}^q$ is the "prior" information
available at the time of the design of the survey on all units $i = 1, \ldots, N$.
We write $y^N = (y_i)_{i=1,\ldots,N}$, $x^N = (x_i)_{i=1,\ldots,N}$
and $z^N = (z_i)_{i=1,\ldots,N}$.

Remark 2.1. In this paper N will denote the size of the finite popu- 
lation (i.e., the number of ultimate sampling units in the population) for 
one-stage-sampling schemes, and it will denote the number of clusters or 
primary sampling units (p.s.u.s) for multistage schemes, in which case the 
size of the finite population will be denoted by M. 

Definition 2.2. A sample is the realization of a probabilistic (random- 
ized) selection or sampling scheme ([28], page 25). We adopt the compre- 
hensive definition of a sample in [10], page 42: it views the sample as "a 
finite sequence of units or labels of the finite population, which are drawn 
one by one until the sampling is finished according to some stopping rule. 
This sequence distinguishes the order of units, may be of variable length 
and may include one unit of the finite population several times." This def- 
inition includes samples selected without replacement (WOR) and WR. In 
what follows, we do not require that samples be selected sequentially, but, 
for convenience, we may consider an order in which the n sampled units are 
either observed or selected. 

In the literature, a design p associated with a sampling scheme is a prob- 
ability function on the set of all possible samples under this scheme (see, 
e.g., [28]). The definition of a sampling design given below requires measur- 
ability of p as a function of the variables containing the prior information. 
The same holds for Definition 2.4 of a finite population parameter (cf. [28], 
page 39). The measurability conditions ensure that, when the finite popu- 
lation is generated by a super-population, the finite population parameter 
and the estimator are real-valued measurable functions (random variables) 
defined on the probability space associated with the super-population (see 
Definition 3.1). 

Definition 2.3. Let $U$ be the finite population of Definition 2.1. Given
a sampling scheme, let $S$ be the set of all possible samples under the scheme.
Let $\mathcal{C}(S)$ consist of all subsets of $S$. A sampling design associated to a
sampling scheme is a function $p : \mathcal{C}(S) \times \mathbb{R}^{q \times N} \to [0, 1]$ such that:

(i) for all $s$ in $S$, $p(s, \cdot)$ is Borel-measurable in $\mathbb{R}^{q \times N}$;

(ii) for $z^N \in \mathbb{R}^{q \times N}$, $p(\cdot, z^N)$ is a probability measure on $\mathcal{C}(S)$.

We say that $(S, \mathcal{C}(S), p)$ is a design probability space, where $p(s, \cdot) > 0$,
$s \in S$.

Remark 2.2. For the sake of simplicity, in all applications we will take
$q = 1$.

Remark 2.3. For a one-stage Poisson sampling scheme, the collection
$S$ of all possible samples is completely determined given only the sizes of the
strata in the population. For other sampling schemes, $S$ cannot be determined
unless we know the strata and sample sizes, and possibly other parameters,
depending on the sampling scheme. Under a one-stage $\pi$ps scheme, $S$
can be defined without prior knowledge of the unit sizes. Under a first-stage
$\pi$ps scheme and a second-stage SRSWOR scheme, we cannot completely
determine $S$ unless we know the first-stage unit sizes, since they are the
second-stage population sizes.

Definition 2.4. Consider a finite population as in Definition 2.1. A
finite population parameter $\theta_N$ is a Borel-measurable function defined on
a subset of $\mathbb{R}^{(p+k+q) \times N}$. An estimator of this finite population parameter
associated with a design, also called a sample estimator, is a function
$\hat{\theta}_N : S \times \mathbb{R}^{(p+k+q) \times N} \to \mathbb{R}^p$, where $\hat{\theta}_N(s, \cdot)$ is Borel-measurable.

In the next example we define the fundamental notation used by Krewski 
and Rao [15], which we will use subsequently (e.g., in Proposition 3.1). 

Example 2.1 (Stratified two-stage PPSWR [15]). Let $N_h$ be the number
of p.s.u.s in stratum $h$, $M_{hi}$ be the number of ultimate units in p.s.u. $hi$,
$i = 1, \ldots, N_h$, $h = 1, \ldots, L$, and $L$ the number of strata. Let
$N = \sum_{h=1}^{L} N_h$, $M_h = \sum_{i=1}^{N_h} M_{hi}$ and $M = \sum_{h=1}^{L} M_h$.
The prior information consists of the "sizes" $z_{hi} = M_{hi}$, $i = 1, \ldots, N_h$,
$h = 1, \ldots, L$. Suppose $n_h \ge 2$ p.s.u.s are selected with replacement in
stratum $h$ with probabilities $p_{hi} = M_{hi}/M_h$, $i = 1, \ldots, N_h$,
$h = 1, \ldots, L$, at each draw. The selection is independent in
each stratum, and independent second-stage samples are taken within those
p.s.u.s selected more than once. The finite population mean is
$\theta_N = \sum_{h=1}^{L} W_h \theta_h$, where $W_h = M_h/M$ is the stratum weight,
$\theta_h = \bar{Y}_h = \sum_{i=1}^{N_h} y_{hi}/M_h$ is the finite population stratum
mean and $y_{hi}$ is the total of p.s.u. $hi$, $i = 1, \ldots, N_h$,
$h = 1, \ldots, L$. Let $I_{hi}^k = 1$ if p.s.u. $hi$ is selected in the sample at the
$k$th draw in stratum $h$ and $0$ otherwise, $k = 1, \ldots, n_h$, $i = 1, \ldots, N_h$,
$h = 1, \ldots, L$. If the cluster $hi$ is selected at the $k$th draw, let
$\hat{y}_{hi}^k$ be an unbiased estimator of the total $y_{hi}$ based on sampling at
the second stage and set $\hat{y}_{hi}^k = 0$ otherwise, $k = 1, \ldots, n_h$,
$i = 1, \ldots, N_h$, $h = 1, \ldots, L$. For stratum $h$, we consider
the estimator $\hat{\theta}_h = \sum_{k=1}^{n_h} \hat{\theta}_h^k / n_h$, where
$\hat{\theta}_h^k = \sum_{i=1}^{N_h} \hat{y}_{hi}^k / (M_h p_{hi})$, $k = 1, \ldots, n_h$,
$h = 1, \ldots, L$. Finally, a design-unbiased sample estimator of
$\theta_N$ is $\hat{\theta}_N(y^N, M^N) = \sum_{h=1}^{L} W_h \hat{\theta}_h$.
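
The design-unbiasedness of the single-draw estimator in Example 2.1 can be checked by direct enumeration. The sketch below is only an illustration under simplifying assumptions: it treats one stratum, it assumes the cluster totals are observed without second-stage error (so the estimated total equals the true total), and the numbers in `totals` and `sizes` are hypothetical.

```python
from fractions import Fraction

def draw_estimator_expectation(totals, sizes):
    """Design expectation of the single-draw estimator in one stratum.

    totals[i] is the cluster total y_hi and sizes[i] the cluster size M_hi.
    Cluster i is drawn with probability p_hi = M_hi / M_h, and a draw of
    cluster i contributes y_hi / (M_h * p_hi) to the estimator.
    """
    M_h = sum(sizes)
    expectation = Fraction(0)
    for y, m in zip(totals, sizes):
        p = Fraction(m, M_h)                        # selection probability p_hi
        expectation += p * Fraction(y) / (M_h * p)  # probability times value
    return expectation

totals = [12, 30, 7]   # hypothetical cluster totals y_hi
sizes = [4, 10, 3]     # hypothetical cluster sizes M_hi
stratum_mean = Fraction(sum(totals), sum(sizes))
print(draw_estimator_expectation(totals, sizes), stratum_mean)
```

Exact rational arithmetic is used so that the expectation and the stratum mean can be compared without rounding error.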

We often refer to conditions C1 to C3 of Yung and Rao [31], which evolved
from conditions introduced by Krewski and Rao [15] for the asymptotic
normality of the sample mean $\hat{\theta}_N$ (see the Appendix).

3. Super-populations. 

Definition 3.1. Consider a finite population $U$ of size $N$ as in
Definition 2.1. A super-population associated with it consists of a probability
space $(\Omega, \mathcal{F}, P)$ and random vectors $(Y_i, X_i, Z_i)$,
$Y_i : \Omega \to \mathbb{R}^p$, $X_i : \Omega \to \mathbb{R}^k$, $Z_i : \Omega \to \mathbb{R}^q$,
such that $Y_i(\omega_0) = y_i$, $X_i(\omega_0) = x_i$, $Z_i(\omega_0) = z_i$, for some
$\omega_0 \in \Omega$, $i = 1, \ldots, N$. We write $Y^N = (Y_i)_{i=1,\ldots,N}$ and define
$X^N$ and $Z^N$ similarly.
We say that $U$ is a realization of or is generated by the super-population.
Any distribution of $(Y^N, X^N, Z^N)$ that is given a priori is called a
super-population model. We note that different outcomes $\omega$ can generate the same
finite population.

Definition 3.1 is similar to the definition given in [28], page 533. We assume
throughout this work that $N$ is not random. In what follows, the subscript
"d" refers to design randomization and "m" refers to the randomization on
the probability space $(\Omega, \mathcal{F}, P)$. We use $E_m$, $V_m$ to denote, respectively, the
expectation and variance with respect to the probability space $(\Omega, \mathcal{F}, P)$. We
use the standard notation $\sigma(X)$ for the $\sigma$-field generated by the function $X$
(see also Definition 3.1 in [22] or [25]).

Example 3.1 (Two-stage super-population model). Let $\Omega$ be the
conceptual population of people living in a country. Suppose it is composed
of $L$ disjoint strata of units $hi$, $i = 1, \ldots, N_h$, $h = 1, \ldots, L$, where unit $hi$
represents a cluster of individuals. Let $(\Omega, \mathcal{F}, P)$ be the corresponding
probability space. Now we assume that $Z_{hi}$ are discrete r.v.s on the probability
space that represent the number of individuals that live in cluster $hi$. We
are interested in characteristics $Y_{hij}$ pertaining to the individuals labelled
by $hij$, living in cluster $hi$, $i = 1, \ldots, N_h$, $h = 1, \ldots, L$. In order to be able
to define the super-population according to Definition 3.1, we must know
an outcome of the $Z_{hi}$, say, the sizes of the clusters of the population
existing right now. Let
$F_m = \{\omega \in \Omega : Z_{hi}(\omega) = M_{hi},\ i = 1, \ldots, N_h,\ h = 1, \ldots, L\}$.
We use this information to define the super-population model by conditioning
on the $\sigma$-field generated by the event $F_m$. The conditional probability
measure is given by $P_m(F, \omega_0) = P(F \mid F_m)$ if $\omega_0 \in F_m$, for $F \in \mathcal{F}$ (see [5],
Equation 3, page 222, and note that $P(F_m) > 0$ since the r.v.s $Z_{hi}$ are
discrete). Now we define the super-population on $(\Omega, \mathcal{F}, P_m)$ by random vectors
$Y_{hij}$ of $p$ socio-economic characteristics associated with the individual $hij$,
$Y_{hij} : \Omega \to \mathbb{R}^p$, $j = 1, \ldots, M_{hi}$, $i = 1, \ldots, N_h$, $h = 1, \ldots, L$. The cluster totals
$\{Y_{hi} = \sum_{j=1}^{M_{hi}} Y_{hij},\ i = 1, \ldots, N_h,\ h = 1, \ldots, L\}$ are assumed i.i.d. r.v.s within
strata.

We now illustrate how conditions that are sufficient for design-based CLTs 
can be justified as a consequence of simple moment conditions in the super- 
population, which, in turn, can be justified by expert knowledge of the 
model. 

Consider the two-stage super-population model of Example 3.1 and assume
that the total number of clusters $N \to \infty$. Assume the sampling design
of Example 2.1, defined on the finite population generated by $\omega \in \Omega$, where
$Y_{hi}(\omega) = \sum_{j=1}^{M_{hi}} Y_{hij}(\omega)$, $\omega \in \Omega$. In Proposition 3.1 below we show that
moment conditions in the super-population yield the Liapunov-type condition
(C'1), similar to $\sum_{h=1}^{L} W_h E_d|\hat{\theta}_h^k - \bar{Y}_h|^{2+\delta} = O(1)$ as $n \to \infty$,
with $\hat{\theta}_h^k$ as in Example 2.1, which is condition C1 of Krewski and Rao [15].

Proposition 3.1. Let $n = n_1 + n_2 + \cdots + n_L$. We assume the
model-based condition

(M1)  $\frac{1}{N} \sum_{h=1}^{L} \sum_{i=1}^{N_h} E_m|Y_{hi}|^{2+\delta} = O(1)$, $\delta > 0$, as $N \to \infty$.

Then

(C'1)  for all $k = 1, \ldots, n_h$, $h = 1, \ldots, L$,  $\sum_{h=1}^{L} W_h E_d|\hat{\theta}_h^k|^{2+\delta}(\omega) = O(1)$,

for all $\omega$ in a set with model probability 1 (a.s. $\omega$), $N \to \infty$, where $\hat{\theta}_h^k$ is
the estimator of the stratum mean based on the $k$th draw in stratum $h$,
$1 \le k \le n_h$, $h = 1, \ldots, L$, defined in Example 2.1.

The proof is given in the Appendix. Here $E_d$ is the design-based expectation
and is calculated in the Appendix. Note that $E_d|\hat{\theta}_h^k|^{2+\delta}(\omega)$ is a random
variable in the model space.

4. The product space. In this section we define a product probability 
space that includes the super-population and the design space, under the 
premise that sample selection and the model characteristic Y are indepen- 
dent given all of the design variables Z. We investigate independence proper- 
ties of the sample variables under the posterior distribution given the sample 
labels and we provide the formal proof of the CLT under the posterior dis- 
tribution for an SRSWOR design. Proposition 4.4 derives the product space 
probability given the model.

Definition 4.1. Consider a finite population of size $N$ generated by a
super-population $(Y^N, X^N, Z^N)$ as in Definition 3.1. We define the product
space as the set $S \times \Omega$ with the $\sigma$-field $\mathcal{C}(S) \times \mathcal{F}$.

Definition 4.2. Consider a super-population associated with a finite
population as in Definition 4.1. Let $p : \mathcal{C}(S) \times \mathbb{R}^{q \times N} \to [0, 1]$ be a sampling
design on the finite population as in Definition 2.3. Then the sampling design
can be viewed as a random variable on $(S \times \Omega, \mathcal{C}(S) \times \mathcal{F})$ defined by

(4.1)  $p(s, \omega) = p(s, Z^N(\omega))$, $s \in S$, $\omega \in \Omega$.

Definition 4.3. We define $P_{d,m}$ as the $\sigma$-additive measure that, on
elementary rectangles of the product $\sigma$-field, has the value

(4.2)  $P_{d,m}(\{s\} \times F) = \int_F p(s, \omega)\, dP$, $s \in S$, $F \in \mathcal{F}$.

Note that each set in $\mathcal{C}(S) \times \mathcal{F}$ can be expressed as a finite union of
elementary rectangles and $P_{d,m}(S \times \Omega) = 1$. Hence, $P_{d,m}$ is a probability measure
on the product space. If $Z^N$ are discrete random variables, we may build the
product space from the super-population model given $Z^N$, with the
probability measure $P_z(\cdot) = P(\cdot \mid F_z)$, $F_z = \{\omega : Z^N(\omega) = z^N\}$. With $P_z$ replacing
$P$ in (4.2), we obtain

$P_{d,m}^{z}(\{s\} \times F) = p(s, z^N) \cdot P_z(F)$, $s \in S$, $F \in \mathcal{F}$.
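
When both the model space and the sample space are finite, the integral in (4.2) reduces to a sum, and the construction of Definition 4.3 can be sketched in a few lines of Python. Everything concrete in the block below is a hypothetical toy setup chosen for illustration: the two-point model space `MODEL`, the two-sample space `SAMPLES`, and the particular design function `design`, whose dependence on the design variable z is arbitrary.

```python
from fractions import Fraction

# Hypothetical two-point model space: omega -> (P({omega}), Z(omega)).
MODEL = {"w1": (Fraction(1, 3), 1), "w2": (Fraction(2, 3), 3)}
SAMPLES = ["a", "b"]  # hypothetical sample space S

def design(s, z):
    """Illustrative design p(s, z): the chance of sample 'a' depends on z."""
    p_a = Fraction(z, z + 1)
    return p_a if s == "a" else 1 - p_a

def product_measure(s, event):
    """P_{d,m}({s} x F) as the sum over omega in F of p(s, Z(omega)) P({omega})."""
    return sum(design(s, MODEL[w][1]) * MODEL[w][0] for w in event)

total = sum(product_measure(s, MODEL.keys()) for s in SAMPLES)
print(product_measure("a", ["w1"]), total)
```

Because the design is a probability measure on the samples for every value of z, the values on the rectangles add up to one, in line with the statement that the product measure of the whole space equals 1.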

Remark 4.1. Any measurable set in the product space is of the form
$B = \bigcup_{s \in S} \{s\} \times F_s$, where some sets $F_s \in \mathcal{F}$ could be empty. By
Definition 4.3, $P_{d,m}(B) = \int_\Omega \sum_{s \in S} p(s, \omega) 1_{F_s}(\omega)\, dP$. We denote the integrand by
$P_{d,m}(B \mid S \times \mathcal{F})(\omega) = \sum_{s \in S} p(s, \omega) 1_{F_s}(\omega)$. By Proposition 4.4 this is a
conditional probability given the $\sigma$-field $S \times \mathcal{F}$.

Remark 4.2. Let $\hat{\theta}_N$ be a sample estimator on the design space with
associated super-population $(Y^N, X^N, Z^N)$. It can be viewed as a random
variable on the product space defined by

(4.3)  $\hat{\theta}_N(s, \omega) = \hat{\theta}_N(s, Y^N(\omega), X^N(\omega), Z^N(\omega))$, $s \in S$, $\omega \in \Omega$.

We omit writing the index $N$ when no confusion may arise.

Definition 4.4 (The sample variables). The components of a sample
outcome $\{y_i, i \in s\}$, $s \in S$, can be viewed as random variables in the
product space, and following Pfeffermann and Sverchkov [17], we call them
sample variables.

A sample can be written as a sequence of labels $i(k)$, indexed by
$k = 1, \ldots, n$, the order in which the labels are observed. Let us define $I_i^k = 1$ if
label $i(k) = i$ and $I_i^k = 0$ otherwise. If the sample is drawn sequentially, the
$I_i^k$ coincide with the $k$th draw indicators in Example 2.1. Thus, the sample
outcome can be written as the sequence of $n$ units, where each coordinate
$k$ of the sequence represents the $y$-value for the label $i(k)$, $k = 1, \ldots, n$:

$$\left( \sum_{i=1}^{N} y_i I_i^1,\ \ldots,\ \sum_{i=1}^{N} y_i I_i^k,\ \ldots,\ \sum_{i=1}^{N} y_i I_i^n \right).$$

The sample variables can be written as

$$Y_{i(k)}(s, \omega) = \sum_{j=1}^{N} Y_j(\omega) I_j^k(s), \qquad k = 1, \ldots, n.$$

We will use this notation subsequently. Note that, for WR designs, the labels
$i(k)$ and $i(l)$ could be the same for $k \ne l$.

Remark 4.3. Assume that the components of $Y^N$ are independent random
variables. If the design is SRSWOR and the components of $Y^N$ are i.i.d.
in the super-population, the "sample variables" $Y_{i(k)}$, $k = 1, \ldots, n$, are
independent in the product space. However, if the original $Y_i$ are not identically
distributed, the variables $Y_{i(k)}$, $k = 1, \ldots, n$, may become stochastically
dependent in the product space. Under a simple random sample with replacement
(SRSWR) design, the variables $Y_{i(k)}$, $k = 1, \ldots, n$, are stochastically
dependent in the product space whether the original super-population variables
are i.i.d. or not. We refer to the Appendix for an illustration of the
mechanism.

Example 4.1 (Stochastic dependence in the product space). Let $N = n = 2$
and $Y_i$, $i = 1, 2$, be i.i.d. r.v.s each with a Bernoulli distribution
$B(1, 0.5)$. Under simple random sampling (SRS),
$P_{d,m}(Y_{i(1)} = 1) = P_{d,m}(Y_{i(2)} = 1) = 0.5$. Under SRSWR,
$P_{d,m}(Y_{i(1)} = 1, Y_{i(2)} = 0) = 0.125 \ne 0.5 \times 0.5$ [see (A.2') in the Appendix],
whereas under SRSWOR, $P_{d,m}(Y_{i(1)} = 1, Y_{i(2)} = 0) = 0.25 = 0.5 \times 0.5$.
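
The two probabilities in Example 4.1 can be reproduced by enumerating the product space directly. The sketch below assumes that every ordered sample is equally likely under its design (four ordered samples under SRSWR, two under SRSWOR); the function name `joint_probability` is introduced here only for illustration.

```python
from fractions import Fraction
from itertools import permutations, product

def joint_probability(samples, target):
    """P_{d,m}(Y_{i(1)} = target[0], Y_{i(2)} = target[1]) for equally likely samples.

    Y_1 and Y_2 are i.i.d. Bernoulli(1/2), so each model outcome (y1, y2) has
    probability 1/4; each sample has design probability 1/len(samples).
    """
    total = Fraction(0)
    for s in samples:
        for y in product([0, 1], repeat=2):                # model outcomes (Y_1, Y_2)
            observed = tuple(y[label - 1] for label in s)  # values at labels i(1), i(2)
            if observed == target:
                total += Fraction(1, len(samples)) * Fraction(1, 4)
    return total

srswr = list(product([1, 2], repeat=2))    # (1,1), (1,2), (2,1), (2,2)
srswor = list(permutations([1, 2], 2))     # (1,2), (2,1)
print(joint_probability(srswr, (1, 0)))    # 1/8
print(joint_probability(srswor, (1, 0)))   # 1/4
```

Under SRSWR the samples with a repeated label can never produce two different observed values, which is what pulls the joint probability below the product of the marginals.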

Example 4.2 (Two-stage super-population model and two-stage design).
We assume the two-stage super-population model of Example 3.1,
where we use the size of the clusters of a population existing right now
to define the model. This minimum necessary information is contained in
$F_m = \{\omega \in \Omega : Z_{hi}(\omega) = M_{hi},\ i = 1, \ldots, N_h,\ h = 1, \ldots, L\}$, where the $M_{hi}$ are
cluster sizes as in Example 3.1. We select the sample with probability
proportional to those sizes, but we want to draw conclusions about a more
general population than the finite population living in those clusters now.
We set $M^{N_h} = (M_{hi})_{i=1,\ldots,N_h}$, $h = 1, \ldots, L$. Once the model is defined, we
define a sample space $S$ as the collection of all possible "stratified clustered"
sequences of units (see Remark 2.3) of a finite population associated with
the super-population model. Then we define a stratified two-stage sampling
design $p(s, M^{N_1}, \ldots, M^{N_L})$ with $L$ strata, $N$ clusters and $M$ ultimate units.
We then construct the space $S \times \Omega$ with probability measure $P_{d,m}$ defined
on the elementary rectangles by
$P_{d,m}(\{s\} \times F) = p(s, M^{N_1}, \ldots, M^{N_L}) P_m(F)$,
$s \in S$, $F \in \mathcal{F}$ (see also Example 4.3' in [27]).

Consider a sample $s_0 \in S$ and let $\sigma(s_0 \times \Omega)$ be the four-set sub-field
generated by $s_0 \times \Omega$. Let $P(\cdot \mid s_0)$ be the conditional probability measure given
this field. We have the following result.

Proposition 4.1. For each $B = \bigcup_{s \in A} \{s\} \times F_s \in \mathcal{C}(S) \times \mathcal{F}$, $F_s \in \mathcal{F}$, we
have

(4.4)  (i)  $P(B \mid s_0) = P_{d,m}(\{s_0\} \times F_{s_0}) / P_{d,m}(\{s_0\} \times \Omega)$

if $s_0 \in A$, and $0$ otherwise. If, in particular, $p(s, \omega)$ does not depend on $\omega \in \Omega$,
and $I_A(s_0)$ is the value of the indicator function of the set $A$ at $s_0$, we have

(ii)  $P(B \mid s_0) = P(F_{s_0}) I_A(s_0)$.

Proof. (i) is immediate from [5], Example 1, page 223. Statement (ii)
follows from (i). □

Proposition 4.2 [Stochastic independence of the sample under $P(\cdot \mid s_0)$].
Let $Y^N$ denote the super-population composed of $N$ independent random
vectors. Assume an SRS design. Under $P(\cdot \mid s_0)$, the variables $Y_{i(k)}$,
$k = 1, \ldots, n$, are stochastically independent if there are no repeated labels
in the selected sample and stochastically dependent otherwise.

See the Appendix for the proof (see also [26]). 

Example 4.3. Let $N = n = 2$. Suppose, as in Example 4.1, that $Y_i$,
$i = 1, 2$, are i.i.d. r.v.s distributed as $B(1, 0.5)$. Assume that we selected
$s_0 = \{1, 2\}$ under SRS. This sample has no repeated labels and
$I_i^1(s_0) I_j^2(s_0) = 0$ if $i = j$, $i, j = 1, 2$. Then
$P(Y_{i(1)} = 1 \mid s_0) = P(Y_{i(2)} = 1 \mid s_0) = 0.5$ and
$P(Y_{i(1)} = 1, Y_{i(2)} = 0 \mid s_0) = 0.25 = 0.5 \times 0.5$ by (A.3) in the Appendix.
Here the sample variables $\{Y_{i(1)}, Y_{i(2)}\}$ under the posterior distribution
given $s_0$ inherit the independence of the $Y$'s, even if the design were SRSWR.

If we selected $s_0 = \{1, 1\}$, then $I_1^1(s_0) I_1^2(s_0) = 1$,
$I_1^1(s_0) I_2^2(s_0) = I_2^1(s_0) I_1^2(s_0) = I_2^1(s_0) I_2^2(s_0) = 0$ and
$P(Y_{i(1)} = 1, Y_{i(2)} = 0 \mid s_0) = 0$. Here $\{Y_{i(1)}, Y_{i(2)}\}$ are
stochastically dependent under $P(\cdot \mid s_0)$.

We next deal with a sequence of super-populations indexed by $\nu = 1, 2, \ldots$.

Proposition 4.3 [Asymptotic normality under $P(\cdot \mid s_\nu)$]. Let $Y_{\nu i}$,
$i = 1, \ldots, N_\nu$, $\nu \ge 1$, be i.i.d. r.v.s on $(\Omega, \mathcal{F}, P)$ with zero mean and finite
variance $\sigma^2 > 0$. Consider SRSWOR samples $s_\nu$ of size $n_\nu$ and $P_\nu = P(\cdot \mid s_\nu)$
as in (4.4). Let $Y_{\nu i(k)}(s_\nu, \omega)$ denote the array of r.v.s as in Definition 4.4.
Then $(\sigma^2 n_\nu)^{-1/2} \left[ \sum_{k=1}^{n_\nu} Y_{\nu i(k)} \right]$ converges in law to a standard normal random
variable.

The proof is in the Appendix. 

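Proposition 4.4. Let $B = \bigcup_{s \in A} \{s\} \times F_s \in \mathcal{C}(S) \times \mathcal{F}$ with all $s$ distinct.
We write

(4.5)  $P_{d,m}(B \mid S \times \mathcal{F})(\omega) = \sum_{s \in A} p(s, \omega) 1_{F_s}(\omega)$, $\omega \in \Omega$.

Then the right-hand side of (4.5) is the conditional probability measure on
$(S \times \Omega, \mathcal{C}(S) \times \mathcal{F})$ given the $\sigma$-field $S \times \mathcal{F}$. The result is also valid if we
replace everywhere $\mathcal{F}$ by $\mathcal{F}_N = \sigma(Y^N, X^N, Z^N)$ or by $\sigma(Z^N)$.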

An outline of the proof is given in the Appendix. 

5. Convergence in the product space and asymptotic independence. In
this section we establish results that enable us to determine the limiting
distribution of a combination of sample estimators and super-population
statistics. Let $\hat{\theta} \in \mathbb{R}^p$ be a sample estimator as in Remark 4.2. We define

$$F(t, \omega) = p(\{s \in S : \hat{\theta}(s, \omega) \le t\}, \omega), \qquad t \in \mathbb{R}^p.$$

Theorem 5.1. We consider a sequence of product spaces and sample
estimators as in Definition 4.3 and Remark 4.2, indexed by $\nu \ge 1$. Let $X_\nu$,
$X$ be random vectors defined on $(\Omega, \mathcal{F}, P)$. We have:

(i) If $X_\nu \to X$ in the law of the model ($P$), then $X_\nu \to X$ in the law of the
product space.

(ii) If $F_\nu(t, \omega) \to F(t, \omega)$ in probability $P$ for all points of continuity $t$
of $F(t, \omega)$, then $F(t, \omega)$ is a bounded random variable in the model space,
and the product-space distribution of $\hat{\theta}_\nu$ converges to $F(t) = \int_\Omega F(t, \omega)\, dP(\omega)$.
In particular, if $\hat{\theta}_\nu(\cdot, \omega)$ is design-consistent a.s. $\omega$, then it is consistent in
the product space.

(iii) Assume that $X_\nu \to X$ in the law of the model and $F_\nu(t, \omega) \to F(t)$ in
probability $P$ as $\nu \to \infty$ for all points of continuity $t$ of $F(t)$, where $F(t)$
is a nonstochastic distribution function. Then the joint distribution function
of $(\hat{\theta}_\nu, X_\nu)$ converges to the product of the two limiting distributions. The
random variables $\hat{\theta}_\nu$ and $X_\nu$ are said to be asymptotically independent.

The proof is given in the Appendix. Note that when the limiting design-based
distribution is normal with mean zero, we only require that the limiting
variance be nonstochastic in the model. This last condition would follow
if we imposed simple conditions in the super-population model, as we did in
Proposition 3.1.

Remark 5.1. The design-based distribution of the sample estimator
$\hat{\theta}_N$ (viewed as a random variable in the product space) is a version of its
conditional distribution in the product space given $S \times \mathcal{F}_N$. This follows if
we take sets of the form $B(t) = \{(s, \omega) : \hat{\theta}(s, \omega) \le t\}$, $t \in \mathbb{R}^p$, in Remark 4.1
and use (A.5) in the Appendix.

Example 5.1 (The ratio estimator of the finite population mean). We assume a one-stage super-population model composed of $L$ disjoint strata of $N_h$ i.i.d. r.v.s $(Y_{hi}, Z_{hi})$, $i = 1, \dots, N_h$, with mean $\mu_h = E_m(Y_{hi})$ and variance $\sigma_h^2 = V_m(Y_{hi})$, $h = 1, \dots, L$. Let $\mu_N = \frac{1}{N}\sum_{h=1}^{L} N_h \mu_h$ be the parameter of interest, $N = N_1 + \cdots + N_L$ and $\Gamma_N = \frac{1}{N}\sum_{h=1}^{L} N_h \sigma_h^2$. The finite population mean is

$\bar Y_N = \frac{1}{N}\sum_{h=1}^{L}\sum_{i=1}^{N_h} y_{hi}.$

Consider a stratified one-stage PPSWR design with the notation of Example 2.1. At each draw $k = 1, \dots, n_h$, the units are selected in the sample $s_h$ with probabilities $p_{hi}$, which are functions of $Z_{hi}$, $i = 1, \dots, N_h$, $h = 1, \dots, L$. The ratio estimator of the finite population mean is

$\bar y_R = (1/\hat N)\sum_{h=1}^{L}\sum_{i \in s_h} y_{hi}/(n_h p_{hi}), \qquad \hat N = \sum_{h=1}^{L}\sum_{i \in s_h} 1/(n_h p_{hi}).$

Let $n = n_1 + \cdots + n_L$, $n_h \ge 1$ and $N \to \infty$, $n \to \infty$. We aim to obtain the asymptotic normality of $\sqrt{n}(\bar y_R - \mu_N)$ as $\nu \to \infty$. Here we construct a product space with the unconditional model probability measure $P$ rather than $P_z$ (defined after Definition 4.3). We decompose $\sqrt{n}(\bar y_R - \mu_N)$ into two terms, as in (1.1), and apply Theorem 5.1. The CLT for $\sqrt{N}(\bar Y_N - \mu_N)$ with limiting variance $\Gamma_m = \lim_N \frac{1}{N}\sum_{h=1}^{L} N_h \sigma_h^2$, $\Gamma_m < \infty$, follows if we assume Liapunov's condition (Theorem 27.3 in [2]),

$\sum_{h=1}^{L} N_h E_m|Y_{hi} - \mu_h|^{2+\delta} = o(N^{1+\delta/2}\,\Gamma_N^{1+\delta/2})$ as $N \to \infty$, for some $\delta > 0$.

Let $\Gamma_d$, the limiting design variance of $\sqrt{n}(\bar y_R - \bar Y_N)$, that is, $\Gamma_d = \lim_n (1/N)(n/N)\sum_{h=1}^{L}\bigl(\sum_{i=1}^{N_h} e_{hi}^2(\omega)/(n_h p_{hi}) - e_h^2(\omega)/n_h\bigr)$, be positive definite, where $e_{hi}(\omega) = y_{hi}(\omega) - \bar Y_N(\omega)$ and $e_h(\omega) = \sum_i e_{hi}(\omega)$ are the residuals, $i = 1, \dots, N_h$, $h = 1, \dots, L$. Note that C2 implies that $\hat N$ is consistent. The CLT for $\sqrt{n}(\bar y_R - \bar Y_N)$ with asymptotic variance $\Gamma_d$ follows as in [31] by assuming conditions (C1) to (C3) in the Appendix applied to the residuals of a first stage sampling design, where $M = N$, and by Slutsky's theorem. Theorem 5.1 can then be applied if we assume that $\Gamma_d$ is nonstochastic.
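
As a computational companion to Example 5.1, the ratio estimator under stratified PPSWR can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `ratio_estimator` and the input layout (one tuple of values, selection probabilities and drawn indices per stratum) are assumptions made here.

```python
def ratio_estimator(strata):
    """Ratio estimator of the finite population mean under stratified PPSWR.

    strata: list of (y, p, draws), one tuple per stratum, where y[i] and p[i] are the
    value and the per-draw selection probability of unit i, and draws lists the index
    of the unit selected at each of the n_h draws (repeats are allowed).
    """
    total_hat = 0.0  # estimates the population total of y
    size_hat = 0.0   # estimates the population size N
    for y, p, draws in strata:
        n_h = len(draws)
        for i in draws:
            total_hat += y[i] / (n_h * p[i])
            size_hat += 1.0 / (n_h * p[i])
    return total_hat / size_hat


if __name__ == "__main__":
    strata = [
        ([1.0, 2.0, 3.0, 4.0], [0.25, 0.25, 0.25, 0.25], [0, 3]),
        ([10.0, 20.0], [0.5, 0.5], [1, 1]),
    ]
    print(ratio_estimator(strata))
```

Because the same weights $1/(n_h p_{hi})$ appear in numerator and denominator, the estimator reproduces a constant exactly, which is a quick sanity check on any implementation.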

6. Sample estimators derived from an estimating equation (EE). In this section we describe a methodology to derive the asymptotic normality of the root $\hat\theta_N$ of the sample estimating equation when centred about the super-population parameter $\theta_0$. We combine existing asymptotic results in both the design and super-population probability as in Theorem 5.1.

Let $(\Omega, \mathcal{F}, P)$ and $(Y^{\nu}, X^{\nu}, Z^{\nu})$ represent a super-population as in Definition 3.1 associated with a design space as in Definition 2.3. The first stage sample size is denoted by $n_\nu$. In what follows we omit the index $\nu$ and we set $N \to \infty$, $n \to \infty$ as $\nu \to \infty$. We first define a finite population estimating equation (EE) estimator and then an EE for the sample space.

Definition 6.1. Let $g$ represent a continuously differentiable function defined on a Euclidean space of the appropriate dimension. We consider functions of the form

(6.1) $G_N(\theta, \omega) = [1/a(N)]\sum_{i=1}^{N} g(Y_i(\omega), X_i(\omega), \theta),$

where $\omega \in \Omega$, $\theta \in \Theta$ and $a(N)/N = O(1)$ as $N \to \infty$. A finite population EE is defined by

(6.2) $G_N(\theta, \omega) = 0.$

A finite population EE estimator is defined as a solution $\theta_N$ of (6.2), when such a solution exists and is unique. For $\omega \in \Omega$ fixed, $\theta_N$ is a finite population parameter.

Definition 6.2. Let $\hat G_N(\theta, \omega)$ be a design-consistent estimator of $G_N(\theta, \omega)$. A sample EE is defined by

(6.3) $\hat G_N(\theta, \omega) = 0.$

A sample EE estimator $\hat\theta_N$ is defined as a solution of the sample EE in (6.3).
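
To make Definition 6.2 concrete, the sketch below solves a sample EE of the weighted form $\sum_{i \in s} w_i\, g(y_i, x_i, \theta) = 0$ for a scalar $\theta$ by Newton iterations. It is an illustration under assumptions: the weights $w_i$, the solver name `solve_sample_ee` and the choice of $g$ are hypothetical, and the normalizing constant $1/a(N)$ is dropped because it does not change the root.

```python
def solve_sample_ee(g, dg, data, weights, theta0, tol=1e-10, max_iter=50):
    """Newton iterations for sum_i w_i * g(y_i, x_i, theta) = 0, scalar theta.

    g(y, x, theta) is the estimating function and dg its derivative in theta.
    """
    theta = theta0
    for _ in range(max_iter):
        G = sum(w * g(y, x, theta) for (y, x), w in zip(data, weights))
        J = sum(w * dg(y, x, theta) for (y, x), w in zip(data, weights))
        step = G / J
        theta -= step
        if abs(step) < tol:
            break
    return theta


if __name__ == "__main__":
    # Assumed example: regression through the origin, g = x * (y - x * theta).
    data = [(2.0, 1.0), (3.0, 2.0), (7.0, 3.0)]   # (y_i, x_i) for the sampled units
    weights = [10.0, 20.0, 5.0]                   # hypothetical sampling weights
    theta_hat = solve_sample_ee(
        lambda y, x, t: x * (y - x * t),
        lambda y, x, t: -x * x,
        data, weights, theta0=0.0,
    )
    print(theta_hat)
```

For this linear $g$ the root has the closed form $\sum w_i x_i y_i / \sum w_i x_i^2$, so the Newton solver can be checked against it; for nonlinear $g$ a good starting value matters.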

Yuan and Jennrich [30] (see also [3]) set general conditions for the existence, strong consistency and asymptotic normality of EE estimators which require independent but not necessarily i.i.d. random vectors $g(Y_i, X_i, \theta)$, $i = 1, \dots, N$. We can apply their results to clustered data models with cluster totals $g_i(\theta) = \sum_{j=1}^{M_i} g(Y_{ij}, X_{ij}, \theta)$, which are stochastically independent. The cluster sizes $M_i$, $i = 1, \dots, N$, stay bounded as the number $N$ of clusters goes to infinity. Theorem 6.1 shows that the sample EE estimator (around the model parameter) is asymptotically normal in the law of the product space. Conditions 1-3 were given by Yuan and Jennrich [30] for the existence and consistency of $\theta_N$ and the asymptotic normality of $\sqrt{N}(\theta_N - \theta_0)$. Conditions 1, 4 and 5 below imply the existence and design-consistency of $\hat\theta_N$ and the design-asymptotic normality of $\sqrt{n}(\hat\theta_N - \theta_N)$.

Theorem 6.1. Consider a sequence of super-populations composed of $N$ independent random vectors associated to design spaces as in Definition 3.1. Let $f = \lim_\nu n/N \ge 0$ as $n \to \infty$. Note that we do not require that $f = 0$. We assume the following conditions.

1. $G_N(\theta_0) \to 0$ with probability one.

2. There is a compact neighborhood $B(\theta_0)$ of $\theta_0$ on which, with probability one, all $G_N(\theta)$ are continuously differentiable and the Jacobians $\partial G_N(\theta)/\partial\theta$ converge uniformly in $\theta$ to a nonstochastic limit $J(\theta)$ which is nonsingular at $\theta_0$.

3. $\sqrt{N}\,G_N(\theta_0) \Rightarrow N(0, \Gamma_m)$ in the law of the super-population.

4. There is a compact neighborhood $B(\theta_0)$ of $\theta_0$ on which $\partial \hat G_N(\theta)/\partial\theta$ converge uniformly in the design probability to a nonstochastic (in design) limit which coincides with $J(\theta)$ at $\theta_0$ for almost every $\omega \in \Omega$. Note that if $\hat G_N(\theta) = \sum_{i \in s} w_i g_i(\theta)$, with $w_i$ independent of $\theta$, and $\{G_N(\theta), N \ge 1\}$ are continuously differentiable, then $\{\hat G_N(\theta), N \ge 1\}$ are also continuously differentiable.

5. $\sqrt{n}\,\hat G_N(\theta_N) \Rightarrow N(0, \Gamma_d)$ in the law of the design as $n \to \infty$ for almost every $\omega \in \Omega$, where the variance-covariance $\Gamma_d$ is nonstochastic in the super-population.

Let $J = J(\theta_0)$ and $\Gamma = J^{-1}[\Gamma_d + f\Gamma_m](J^{-1})^{T}$. Then we have, in the law of the product space,

(6.4) $\sqrt{n}(\hat\theta_N - \theta_0) \Rightarrow N(0, \Gamma).$

Estimation of $\Gamma$ from the sample data depends on the particular design under consideration for the estimation of $\Gamma_d$, and on both the model assumed for the variance-covariance structure of the super-population and the sampling design for the estimation of $\Gamma_m$ (see Example 6.1). The Jacobian matrix $J = J(\theta_0)$ can be estimated consistently by $(\partial \hat G_N/\partial\theta)(\hat\theta_N)$: this follows from Assumptions 2 and 4 and the consistency of $\hat\theta_N$ [from (6.4)]. The proof of Theorem 6.1 is in the Appendix.
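
Once estimates of $J$, $\Gamma_d$ and $\Gamma_m$ are available, assembling the asymptotic variance in (6.4) is mechanical. The sketch below is a plain-Python illustration under assumptions: matrices are small lists of lists, the helper names are hypothetical, and the second factor of the sandwich is taken to be the transposed inverse of $J$ (which makes no difference when $J$ is symmetric).

```python
def mat_mul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def inverse(a):
    """Gauss-Jordan inversion with partial pivoting; assumes a is nonsingular."""
    n = len(a)
    aug = [list(map(float, row)) + [float(i == j) for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def sandwich_variance(J, gamma_d, gamma_m, f):
    """Gamma = J^{-1} [Gamma_d + f * Gamma_m] (J^{-1})^T."""
    middle = [[d + f * m for d, m in zip(rd, rm)] for rd, rm in zip(gamma_d, gamma_m)]
    J_inv = inverse(J)
    return mat_mul(mat_mul(J_inv, middle), transpose(J_inv))


if __name__ == "__main__":
    J = [[2.0, 0.0], [0.0, 4.0]]
    gamma_d = [[4.0, 0.0], [0.0, 16.0]]
    gamma_m = [[8.0, 0.0], [0.0, 16.0]]
    print(sandwich_variance(J, gamma_d, gamma_m, f=0.5))
```

The sampling fraction $f$ controls how much the model variance contributes: with $f = 0$ the result reduces to the purely design-based sandwich $J^{-1}\Gamma_d (J^{-1})^{T}$.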

Remark 6.1. Korn and Graubard [14] propose direct estimators of the variance-covariance of the sample mean under different super-population models and sampling designs. See also Rubin-Bleuer [21]. In Example 6.1 we assume a two-stage super-population model and design to estimate $\Gamma$. There, (i) we establish a model for the super-population variance-covariance structure so that we can estimate the model variance matrix from the sample data and (ii) we examine the design conditions for the asymptotic normality of $\sqrt{n}\,\hat G_N(\theta_N)$ to hold in the finite population.

Example 6.1 (General EE sample estimator under a stratified two-stage super-population model and design). Assume the stratified two-stage super-population model of Example 3.1 with the addition of the auxiliary information given by $X_{hij}$ ($h$, $i$ and $j$ as in Example 3.1) and the two-stage design of Example 2.1. In this example, we construct separate consistent estimators within each stratum and so we need to assume that $n_h \to \infty$ and $N_h \to \infty$, $h = 1, \dots, L$.

Let the finite population EE, given by

$G_N(\theta) = \frac{1}{M}\sum_{h=1}^{L}\sum_{i=1}^{N_h} g_{hi}(\theta) = 0,$

with $g_{hi}(\theta) = \sum_{j=1}^{M_{hi}} g(Y_{hij}, X_{hij}, \theta)$, satisfy the first three conditions of Theorem 6.1. Now let $\hat G_N(\theta)$ be the sample estimator of $G_N(\theta)$, where $G_N(\theta)$ replaces $\theta_N$ in Example 2.1. Assume $M/N \to m < \infty$ as $N \to \infty$. Also assume $n_h/N_h = c_h$ constant as $N \to \infty$, for all $h = 1, \dots, L$.

(i) Assume that the second stage observations $g_{hij}(\theta)$ are i.i.d. r.v.s with means $\mu_{hi}$ and variances $\sigma_{hi}^2$, $j = 1, 2, \dots, M_{hi}$. Furthermore, $(\mu_{hi}, \sigma_{hi}^2)$ are i.i.d. r.v.s, where the $\mu_{hi}$s have model variances $V_m(\mu_{hi}) = \gamma_h$, and the $\sigma_{hi}^2$s have model expectations $E_m(\sigma_{hi}^2) = \sigma_h^2$, $i = 1, 2, \dots, N_h$, $h = 1, \dots, L$. Thus,

(6.5) $V_m(\sqrt{N}\,G_N(\theta)) = (N/M)\sum_{h=1}^{L}\Bigl\{W_h\sigma_h^2 + \gamma_h\sum_{i=1}^{N_h} M_{hi}^2/M\Bigr\}$, with $W_h = \sum_{i=1}^{N_h} M_{hi}/M$, $h = 1, \dots, L$.
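
For given variance components, the right-hand side of (6.5) is a short computation. The sketch below is illustrative only: the function name `model_variance` and the input layout (a list of cluster sizes $M_{hi}$ per stratum) are assumptions, with $N$ the total number of clusters and $M$ the total number of second stage units.

```python
def model_variance(cluster_sizes, sigma2, gamma):
    """(N/M) * sum_h { W_h * sigma_h^2 + gamma_h * sum_i M_hi^2 / M }, W_h = M_h / M.

    cluster_sizes[h] lists the sizes M_hi of the N_h clusters in stratum h;
    sigma2[h] and gamma[h] are the variance components of stratum h.
    """
    N = sum(len(sizes) for sizes in cluster_sizes)  # total number of clusters
    M = sum(sum(sizes) for sizes in cluster_sizes)  # total number of elements
    total = 0.0
    for sizes, s2, g in zip(cluster_sizes, sigma2, gamma):
        W_h = sum(sizes) / M
        total += W_h * s2 + g * sum(m * m for m in sizes) / M
    return (N / M) * total


if __name__ == "__main__":
    print(model_variance([[2, 3], [5]], sigma2=[1.0, 2.0], gamma=[0.5, 0.25]))
```

The term in $\gamma_h$ grows with the squared cluster sizes, so unequal cluster sizes inflate the between-cluster contribution relative to equal sizes with the same total.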

To obtain a (model) consistent estimator of $V_m(\sqrt{N}\,G_N(\theta))$, it is enough to get consistent or asymptotically unbiased estimators of $\sigma_h^2$ and $\gamma_h$, $h = 1, \dots, L$. These can be written as quadratic functions of the finite population values $g_{hij} = g_{hij}(\theta)$ and, thus, they are finite population parameters:



and 

Nh r „ ^2 ^ /Nh 



giJMhi) UMu-I). h = l,...,L, 



1 f r ■ 1 ^ if 
where these finite population quantities are model unbiased, as well as model consistent, for $\sigma_h^2$ and $\gamma_h$, $h = 1, \dots, L$. Any pair of design-consistent or asymptotically design-unbiased estimators $\hat\sigma_h^2$ and $\hat\gamma_h$ of the respective finite population parameters can replace $\sigma_h^2$ and $\gamma_h$, $h = 1, \dots, L$, in (6.5) to yield an asymptotically unbiased estimator of $\Gamma_m = \lim_N V_m(\sqrt{N}\,G_N(\theta))$ in the product space.

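(ii) To obtain the asymptotic normality of $\sqrt{n}\,\hat G_N(\theta_N)$, we express $\hat G_N(\theta_N)$ as the sum of $n = n_1 + \cdots + n_L$ independent, zero mean random vectors

$\hat G_N(\theta_N) = \sum_{h=1}^{L}(W_h/n_h)\sum_{k=1}^{n_h} z_{hk}(\theta_N),$

with $W_h = M_h/M$, $z_{hk}(\theta_N) = \hat g_{hi}(\theta_N)/(M_h p_{hi}) - \sum_{i=1}^{N_h} g_{hi}(\theta_N)/M_h$, where $\hat g_{hi}(\theta_N)$ denotes the second stage unbiased sample estimator of $g_{hi}(\theta_N)$.

The design-based CLT for $\sqrt{n}\,\hat G_N(\theta_N)$ with positive definite $\Gamma_d = \lim_n n\sum_{h=1}^{L} W_h^2 V_d(z_{hk}(\theta_N))/n_h$ follows from conditions (C1)-(C3) in the Appendix, with the sample quantities there replaced by $z_{hk}(\theta_N)$.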

As in Proposition 3.1, one can give conditions in the super-population so 
that a Liapunov-type condition holds in the design space and the asymptotic 
design variance exists and is nonstochastic (condition 5 of Theorem 6.1). 
The super-population conditions required for the latter are more complex 
than those stated in Proposition 3.1, but they can be specified in the same 
way. We do not spell them out here. 

APPENDIX 

Yung and Rao [31] design conditions for the asymptotic normality of the sample mean.

(Ci) ni+^ELiEfcii^d|W^MI'+^ = 0(l) as n^oo, §1 as in Exam- 
ple 2.1. 

(C2) $(n/M)\max_{h,i,j} m_{hi} w_{hij} = O(1)$ as $n \to \infty$, where $m_{hi}$ are the second stage sample sizes and $w_{hij}$ are the sampling weights.

(C3) $nV_d(\hat\theta_N) \to \Gamma_d$, with $\Gamma_d$ positive definite, as $n \to \infty$, $\hat\theta_N$ as in Example 2.1.

Proof of Proposition 3.1. Since if Em\X\ is finite, then |X(u;)| is 
finite a.s. cj, condition Mi implies (1/A^) ELi E£\ l^feiP^'' = 0(1) a.s. 
Lu. (C'l) follows from the boundedness of two terms once we take N = 2 
and p = 2 + 5 in the inequality ^|(1/A^) Ek=i ^fcl^ < iV^) ELi 
(see (7), page 95 of [5]). Since \Yh\^+^ = \Ed[9^]\'^+^ < Ed\e^\^+\ we only 
need to show that, for ah k = l,...,nh, h = 1, . . . ,L, Y^h=iWhEd\6^\'^~^^ = 
0(1) a.s. u; as iV ^ oo. At the /c-draw we select one cluster, so E'^l^^p^^ =
E£'ll^h^M/M;,i|2+V^ = E£'il>^iMI'+'M^"/"X"'- Since N<M, 
Y.LiWhEd\et?^^ < {^/N)ELlE^=l\Yh^\^^^, which is 0(1) when (Mi) 
holds. □ 

Proof of Remark 4.3. Under SRS we have $P_{d,m}(Y_{i(k)}(s,\omega) \le a) = (1/N)\sum_{i=1}^{N} P(Y_i(\omega) \le a)$, $k = 1, \dots, n$. For $n \ge 2$, $k \ne \ell = 1, \dots, n$, under SRSWOR we have

(A.1) $P_{d,m}(Y_{i(k)}(s,\omega) \le a,\ Y_{j(\ell)}(s,\omega) \le b) = [1/(N(N-1))]\sum_{i}\sum_{j \ne i} P(Y_i(\omega) \le a)\,P(Y_j(\omega) \le b),$

and under SRSWR we have

(A.2) $P_{d,m}(Y_{i(k)}(s,\omega) \le a,\ Y_{j(\ell)}(s,\omega) \le b) = (1/N^2)\Bigl\{\sum_{i}\sum_{j \ne i} P(Y_i(\omega) \le a)\,P(Y_j(\omega) \le b) + \sum_{i} P(Y_i(\omega) \le \min(a,b))\Bigr\}.$

Under SRSWOR let $P(Y_i \le a) = p(a)$ for all $i = 1, \dots, N$. Then $P_{d,m}(Y_{i(k)}(s,\omega) \le a) = p(a)$, $k = 1, \dots, n$, and the right-hand side in (A.1) is $p(a)p(b)$, which proves pairwise independence in the product space. Overall independence is proved similarly. If the $Y_i$s are not identically distributed, we show dependence via a counterexample. Let $P(Y_1 \le a) = p_1$ and $P(Y_i \le a) = p_2$ for $i = 2, 3, \dots, N$. If we take $N = 2$, we have $P(Y_{i(k)} \le a) = [p_1 + p_2]/2$, $k = 1, 2$, and $P(Y_{i(1)} \le a,\ Y_{j(2)} \le a) = p_1 p_2$. Independence holds only when $p_1 = p_2$. Under SRSWR dependence in the product space follows from (A.2). For discrete $Y$'s,

Pd,ni{yi{k){s,i^) = a, yj(£)(s,u;) = 6) 

(A.2') 
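
The SRSWOR counterexample in the proof of Remark 4.3 can be verified by exact enumeration. The sketch below is an illustration with hypothetical names: it evaluates the right-hand side of (A.1) at $a = b$ for independent $Y_i$ with $P(Y_i \le a)$ given by `probs[i]`, and compares it with the product of the two marginal probabilities.

```python
from fractions import Fraction
from itertools import permutations


def srswor_joint_and_product(probs):
    """Return (joint, product of marginals) for the events {first draw <= a} and
    {second draw <= a} under SRSWOR, where probs[i] = P(Y_i <= a) and the Y_i
    are independent in the model."""
    N = len(probs)
    ordered = list(permutations(range(N), 2))  # equally likely ordered pairs, i != j
    w = Fraction(1, len(ordered))
    joint = sum(w * probs[i] * probs[j] for i, j in ordered)
    marginal = sum(probs) / N                  # the same for both draws
    return joint, marginal * marginal


if __name__ == "__main__":
    print(srswor_joint_and_product([Fraction(1, 4), Fraction(3, 4)]))
    print(srswor_joint_and_product([Fraction(1, 3), Fraction(1, 3)]))
```

With $N = 2$ the difference between the product of marginals and the joint probability is $(p_1 - p_2)^2/4$, which vanishes only when $p_1 = p_2$, in agreement with the proof.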



Proof of Proposition 4.2. For $s_0 \in S$, $k \ne \ell = 1, \dots, n$, we have, by Proposition 4.1, part (ii),

$P(Y_{i(k)}(s,\omega) \le a \mid s_0) = \sum_{i=1}^{N} P(Y_i \le a)\,I_i^k(s_0)$

and

(A.3) $P(Y_{i(k)}(s,\omega) \le a,\ Y_{j(\ell)}(s,\omega) \le b \mid s_0) = \sum_{i=1}^{N}\sum_{j=1}^{N} P(Y_i \le a,\ Y_j \le b)\,I_i^k(s_0)\,I_j^\ell(s_0).$

In the WOR case we have $I_i^k(s_0)I_i^\ell(s_0) = 0$ for every $k \ne \ell$, $i = 1, \dots, N$, and these terms disappear in the double sum, yielding independence. For samples $s_0 \in S$ for which $I_i^k(s_0)I_i^\ell(s_0) = 1$ for some $i$, the double sum above contains nonzero terms where $i = j$. The terms corresponding to the repeated labels in the product of the two distributions are different from their counterpart terms in the joint distribution:

$\sum_{i=1}^{N} P(Y_i \le a)\,P(Y_i \le b)\,I_i^k(s_0)I_i^\ell(s_0) \ne \sum_{i=1}^{N} P(Y_i \le \min(a,b))\,I_i^k(s_0)I_i^\ell(s_0)$ for continuous $Y$'s

and

$\sum_{i=1}^{N} P(Y_i = a)\,P(Y_i = b)\,I_i^k(s_0)I_i^\ell(s_0) \ne \sum_{i=1}^{N} P(Y_i = a,\ Y_i = b)\,I_i^k(s_0)I_i^\ell(s_0)$ for discrete $Y$'s.

□



Proof of Proposition 4.3. Under $P(\cdot \mid s_\nu)$ the $Y_{i(k)}$, $i = 1, \dots, N_\nu$, $\nu \ge 1$, are i.i.d. r.v.s with mean zero and constant variance. That they are identically distributed like the original $Y$'s follows by (A.3), and independence follows from Proposition 4.2 for SRSWOR. As in Theorem 27.2 of [2], (27.9) holds, which implies the Lindeberg condition and proves the result. □



Proof of Proposition 4.4. We prove that, for each $\omega \in \Omega$, the right-hand side of (4.5) is a probability measure on the product space, and that, for each measurable set $B$ in the product space, it is a version of the conditional probability of $B$ given $S \times \mathcal{F}$ ([5], page 223). The first statement follows from the additivity of $p$ and the $\sigma$-additivity of the indicator functions. To prove the second, we note first that $p(s, \cdot)$ is $\mathcal{F}$-measurable. Then it suffices to show that, on the elementary rectangles $B = s_0 \times F_0$, we have

$\int_{S \times F} p(s_0,\omega)\,I_{F_0}(\omega)\,dP_{d,m} = P_{d,m}(B \cap (S \times F)), \qquad F \in \mathcal{F}.$

The left-hand side above equals $\sum_{s \in S}\int_{F \cap F_0} p(s_0,\omega)\,p(s,\omega)\,dP$ by (4.2). By Definition 2.3(ii), the sum above equals

$\int_{F \cap F_0} p(s_0,\omega)\,dP = P_{d,m}(\{s_0\} \times (F \cap F_0)).$ □

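Proof of Theorem 5.1. Let $t$ be fixed. We first omit indexing the populations. Let

$B(t) = \{(s,\omega) : \hat\theta(s,\omega) \le t\} = \bigcup_{s \in S} \{s\} \times F_s,$

with

$F_s = \{\omega \in \Omega : \hat\theta(s,\omega) \le t\} \in \mathcal{F}.$

By Remark 4.1, $P_{d,m}(B(t)) = \sum_{s \in S}\int_\Omega p(s,\omega)\,I_{F_s}(\omega)\,dP$. Note that

(A.4) $F(t,\omega) = p(\{s : \hat\theta(s,\omega) \le t\}, \omega) = \sum_{s \in S} p(s,\omega)\,I_{A_\omega}(s),$

where $A_\omega = \{s \in S : \hat\theta(s,\omega) \le t\} \in C(S)$. For each $(s,\omega)$ the indicator function of $F_s$ coincides with the indicator function of $A_\omega$,

(A.5) $I_{F_s}(\omega) = I_{A_\omega}(s).$

Using (A.5) in (A.4) and the formula for $P_{d,m}(B(t))$, we have

(A.6) $P_{d,m}(B(t)) = \int_\Omega F(t,\omega)\,dP$ and $P_{d,m}(B(t) \cap E) = \int_{\Omega \cap E} F(t,\omega)\,dP, \quad E \in \mathcal{F}.$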

(i) This follows from $P(\lambda_\nu \le u) = P_{d,m}(S \times \{\lambda_\nu \le u\})$ and $\sum_{s \in S} p(s,\omega) = 1$ for all $\omega \in \Omega$.

(ii) $F_\nu(t,\omega)$ converges in probability to $F(t,\omega)$ at points of continuity $t$. Since $0 \le F_\nu(t,\omega) \le 1$, the bounded convergence theorem (Theorem 16.5 of [1], page 180) implies

(A.7) $\int_\Omega F_\nu(t,\omega)\,dP - F(t) \to 0$ as $\nu \to \infty$.

(A.6) and (A.7) yield (ii),

$P_{d,m}(B_\nu(t)) = \int_\Omega F_\nu(t,\omega)\,dP \to F(t)$ as $\nu \to \infty$.

(iii) Consider the indicator function $I_\nu(u,\omega) = I_{\{\omega : \lambda_\nu(\omega) \le u\}}(\omega)$. Using (A.6) with $E_\nu(u) = S_\nu \times \{\omega : \lambda_\nu(\omega) \le u\}$, the joint distribution function of $(\hat\theta_\nu, \lambda_\nu)$ can be expressed as

$P_{d,m}\{(s,\omega) : (\hat\theta_\nu, \lambda_\nu) \le (t,u)\} = P_{d,m}(B_\nu(t) \cap E_\nu(u)) = \int_\Omega I_\nu(u,\omega)\,F_\nu(t,\omega)\,dP,$

and if $H$ denotes the distribution function of $\lambda$, we have, for points of continuity $t$, $u$ of $F$ and $H$,

$\int_\Omega I_\nu(u,\omega)\,F_\nu(t,\omega)\,dP - F(t)H(u) = \int_\Omega I_\nu(u,\omega)\,(F_\nu(t,\omega) - F(t))\,dP + F(t)\int_\Omega (I_\nu(u,\omega) - H(u))\,dP.$

All functions are bounded by one, so the first term of the right-hand side converges to zero by the bounded convergence theorem since, by hypothesis, $F_\nu(t,\cdot) - F(t)$ converges to zero in probability $P$ at points of continuity $t$ of $F$. The second term also converges to zero by hypothesis. □

Proof of Theorem 6.1. For simplicity we assume that $f = n/N$ for all $n$:

(A.8) $\sqrt{n}(\hat\theta_N - \theta_0) = \sqrt{n}(\hat\theta_N - \theta_N) + \sqrt{f}\sqrt{N}(\theta_N - \theta_0).$

Assumptions 1-3 imply the asymptotic normality of the second term on the right-hand side of (A.8), in the law of the model (see [30]). This and Theorem 5.1(i) imply convergence in the law of the product space. Next we observe that $\hat\theta_N$ exists and $\hat\theta_N - \theta_N \to 0$ in design probability as $n \to \infty$ for almost every $\omega \in \Omega$. Indeed, the $\hat G_N(\theta)$ are continuously differentiable and design consistency implies that $\hat G_N(\theta)$ converges to $G(\theta)$ (the limit of $G_N(\theta)$ in [30]) in design probability. Hence, we can apply to $\hat G_N(\theta)$ the techniques of Theorems 1 and 2 of [30], and, thus, Assumptions 1 and 4 imply that $\hat\theta_N \to \theta_0$ in the design probability. Since the above mentioned theorems also imply $\theta_N \to \theta_0$ a.s. in the model probability $P$, we have $\hat\theta_N - \theta_N \to 0$ in the design probability a.s. $\omega$. Conditions 4 and 5 imply asymptotic normality of the first term in the right-hand side of (A.8). This, in turn, implies convergence in the product space, by Theorem 5.1(ii). The two terms in (A.8) are not stochastically independent in general. Theorem 5.1(iii) and Assumption 5 imply the "asymptotic independence" of the terms and the asymptotic normality of the sum. □
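
The decomposition (A.8) can be illustrated by simulation in the simplest case, where $\theta_0$ is a model mean, $\theta_N$ the finite population mean and $\hat\theta_N$ the SRSWOR sample mean. The sketch below is a Monte Carlo illustration under assumptions that are not part of the proof: normal i.i.d. super-population values, fixed $N$ and $n$, and hypothetical function names. In this setting the design term has variance $(1-f)\sigma^2$, the model term has variance $f\sigma^2$, and the two terms are uncorrelated.

```python
import math
import random


def simulate(N=200, n=50, mu=0.0, sigma=1.0, reps=4000, seed=1):
    """Draw a super-population, then an SRSWOR sample, and return the two terms of
    the decomposition sqrt(n)(ybar - mu) = sqrt(n)(ybar - Ybar_N) + sqrt(n)(Ybar_N - mu)."""
    rng = random.Random(seed)
    design_terms, model_terms = [], []
    for _ in range(reps):
        pop = [rng.gauss(mu, sigma) for _ in range(N)]   # model phase
        pop_mean = sum(pop) / N
        sample = rng.sample(pop, n)                      # design phase (SRSWOR)
        sample_mean = sum(sample) / n
        design_terms.append(math.sqrt(n) * (sample_mean - pop_mean))
        model_terms.append(math.sqrt(n) * (pop_mean - mu))
    return design_terms, model_terms


def variance(x):
    m = sum(x) / len(x)
    return sum((v - m) ** 2 for v in x) / (len(x) - 1)


def correlation(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)
    return cov / math.sqrt(variance(x) * variance(y))


if __name__ == "__main__":
    d, m = simulate()
    total = [a + b for a, b in zip(d, m)]
    print(variance(d), variance(m), variance(total), correlation(d, m))
```

With $f = n/N = 0.25$ and $\sigma^2 = 1$, the simulated variances should be close to 0.75 and 0.25, their sum close to 1, and the correlation close to zero, up to Monte Carlo error.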